Fixing the Embarrassing Slowness of OpenDHT on PlanetLab
Abstract
The distributed hash table, or DHT, is a distributed system that provides a traditional hash table’s simple put/get interface using a peer-to-peer overlay network. To echo the prevailing hype, DHTs deliver incremental scalability in the number of nodes, extremely high availability of data, low latency, and high throughput.
Over the past 16 months, we have run a public DHT service called OpenDHT [14] on PlanetLab [2], allowing any networked host to perform puts and gets over an RPC interface. We built OpenDHT on Bamboo [13] and shamelessly adopted other techniques from the literature (including recursive routing, proximity neighbor selection, and server selection) in an attempt to deliver good performance. Still, our most persistent complaint from actual and potential users remained, “It’s just not fast enough!” Specifically, while the long-term median latency of gets in OpenDHT was just under 200 ms, matching the best performance reported for DHASH [5] on PlanetLab, the 99th percentile was measured in seconds, and even the median rose above half a second for short periods.
Unsurprisingly, the long tail of this distribution was caused by a few arbitrarily slow nodes. We have observed disk reads that take tens of seconds, computations that take hundreds of times longer to perform at some times than at others, and internode ping times well over a second.
We were thus tempted to blame our performance woes on PlanetLab (a popular pastime in distributed systems these days), but this excuse was problematic for two reasons. First, peer-to-peer systems are supposed to capitalize on existing resources not necessarily dedicated to the system, and do so without extensive management by trained operators. In contrast to managed, cluster-based services supported by extensive advertising revenue, peer-to-peer systems were supposed to bring power to the people, even those with flaky machines.
Second, it is not clear that the problem of slow nodes is limited to PlanetLab. For example, the best DHASH performance on the RON testbed, which is smaller and less loaded than PlanetLab, still shows a 99th percentile get latency of over a second [5]. Furthermore, it is well known that even in a managed cluster the distribution of individual machines’ performance is long-tailed. The performance of Google’s MapReduce system, for example, was improved by 31% when it was modified to account for a few slow machines its designers called “stragglers” [6]. While PlanetLab’s performance is clearly worsened by the fact that it is heavily shared, the current trend towards utility computing indicates that such sharing may be common in future service infrastructures.
It also seems unlikely that one could “cherry pick” a set of well-performing hosts for OpenDHT. The MapReduce designers, for example, found that a machine could suddenly become a straggler for a number of reasons, including cluster scheduling conflicts, a partially failed hard disk, or a botched automatic software upgrade. Also, as we show in Section 2, the set of slow nodes isn’t constant on PlanetLab or RON. For example, while 90% of the time it takes under 10 ms to read a random 1 kB disk block on PlanetLab, over a period of only 50 hours, 235 of 259 hosts will take over 500 ms to do so at least once. While one can find a set of fast nodes for a short experiment, it is nearly impossible to find such a set on which to host a long-running service.
We thus adopt the position that the best solution to the problem of slow nodes is to modify our algorithms to account for them automatically. Using a combination of delay-aware routing and a moderate amount of redundancy, our best technique reduces the median latency of get operations to 51 ms and the 99th percentile to 387 ms, a tremendous improvement over our original algorithm.
In the next section we quantify the problem of slow nodes on both PlanetLab and RON. Then, in Sections 3 and 4, we describe several algorithms for mitigating the effects of slow nodes on end-to-end get latency and show their effectiveness in an OpenDHT deployment of approximately 300 PlanetLab nodes. We conclude in Section 5.
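To build intuition for why a moderate amount of redundancy can shorten the tail of the get latency distribution, the following is a minimal toy simulation. It is not OpenDHT's algorithm and does not model delay-aware routing. The latency distribution, the 5% slow-node probability, and the function names (`node_latency_ms`, `get_latency_ms`, `simulate`) are all illustrative assumptions. The sketch only shows that when a get is sent to several replicas and completes on the first reply, a single slow node rarely determines the end-to-end latency.

```python
import math
import random


def node_latency_ms(rng, slow_prob=0.05):
    """Hypothetical response time of one node (assumed distribution).

    Most responses are fast; with probability `slow_prob` the node is
    in a slow episode and takes between 0.5 and 5 seconds.
    """
    if rng.random() < slow_prob:
        return rng.uniform(500.0, 5000.0)  # slow-node episode
    return rng.uniform(20.0, 200.0)        # typical response


def get_latency_ms(rng, redundancy):
    """A get sent to `redundancy` replicas finishes on the first reply."""
    return min(node_latency_ms(rng) for _ in range(redundancy))


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[min(rank, len(ordered)) - 1]


def simulate(redundancy, trials=20000, seed=1):
    """Return (median, 99th percentile) get latency in milliseconds."""
    rng = random.Random(seed)
    samples = [get_latency_ms(rng, redundancy) for _ in range(trials)]
    return percentile(samples, 50), percentile(samples, 99)


if __name__ == "__main__":
    for r in (1, 2, 3):
        median, p99 = simulate(r)
        print(f"redundancy={r}: median={median:.0f} ms, p99={p99:.0f} ms")
```

Under these assumed numbers, a single request hits a slow node about 5% of the time, so the 99th percentile falls inside the slow range. With two replicas, both must be slow at once (about 0.25% of the time), which pushes the 99th percentile back into the fast range. Real deployments pay for this with extra network and node load, which the sketch ignores.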